Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Add compatibility checks for Schemas with default values #11434

Merged
merged 6 commits into from
Oct 30, 2024

Conversation

rdblue
Copy link
Contributor

@rdblue rdblue commented Oct 30, 2024

This updates Schema.checkCompatibility:

  • Adds a check that initial-default is not set for v1 and v2 tables, which would break forward compatibility
  • Updates the method to return a list of compatibility problems rather than one at a time
  • Adds tests for checkCompatibility in TestSchema

I think that this should get into 1.7.0 to avoid accidentally leaking initial defaults in v1 and v2 tables.

@github-actions github-actions bot added the API label Oct 30, 2024
@RussellSpitzer RussellSpitzer added this to the Iceberg 1.7.0 milestone Oct 30, 2024

@ParameterizedTest
@ValueSource(ints = {1, 2, 3})
public void testSupportedInitialDefault(int formatVersion) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testSupportedWriterDefault?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public void testSupportedInitialDefault(int formatVersion) {
public void testSupportedWriterDefault(int formatVersion) {

@ParameterizedTest
@ValueSource(ints = {1, 2, 3})
public void testSupportedInitialDefault(int formatVersion) {
// only the initial default is a forward-incompatible change
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We could make this self contained to the test by saying

Writer defaults are universally compatible

Or something?

@RussellSpitzer
Copy link
Member

Tests failing are because Metadata.json also has validations for this and this check is failing first

problems.put(
field.fieldId(),
String.format(
"Invalid type for %s: %s is not supported until v%s",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing checks in TableMetadata look like

"Invalid type in v%s schema: struct.ts_nanos timestamptz_ns is not supported until v3"

Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, dm'd a patch for the one broken test (Expected error message has changed)

And one required change for the test name in this PR which is I believe misnamed

@github-actions github-actions bot added the core label Oct 30, 2024
@rdblue
Copy link
Contributor Author

rdblue commented Oct 30, 2024

Updated. Thanks, @RussellSpitzer!

Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @RussellSpitzer

@@ -586,16 +587,37 @@ private List<NestedField> reassignIds(List<NestedField> columns, TypeUtil.GetID
* @param formatVersion table format version
*/
public static void checkCompatibility(Schema schema, int formatVersion) {
// check the type in each field
// accumulate errors as a treemap to keep them in a reasonable order
Map<Integer, String> problems = Maps.newTreeMap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[optional] if we are going for another revision as fieldId() is int :

Suggested change
Map<Integer, String> problems = Maps.newTreeMap();
Map<int, String> problems = Maps.newTreeMap();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Primitives are not allowed as Map Keys :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or any type parameter I think ... I feel like this is a 1337code question

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! 🚀 Let's get 1.7.0 out :)

@RussellSpitzer RussellSpitzer merged commit 91e04c9 into apache:main Oct 30, 2024
50 checks passed
@RussellSpitzer
Copy link
Member

Thanks everyone - @rdblue (for original fix), @Fokko , @anuragmantri , @kevinjqliu , @singhpk234 , @amogh-jahagirdar For reviews!

zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants